Video Time: Properties, Encoders and Evaluation
Time-aware encoding of frame sequences in a video is a fundamental problem in
video understanding. While many works have attempted to model time in videos, an
explicit study quantifying video time is missing. To fill this lacuna, we aim to
evaluate video time explicitly. We describe three properties of video time,
namely a) temporal asymmetry, b) temporal continuity, and c) temporal causality.
For each property, we formulate a task that quantifies it.
This allows assessing the effectiveness of modern video encoders, like C3D and
LSTM, in their ability to model time. Our analysis provides insights about
existing encoders while also leading us to propose a new video time encoder,
which is better suited for the video time recognition tasks than C3D and LSTM.
We believe the proposed meta-analysis provides a reasonable baseline for
assessing video time encoders on equal grounds across a set of temporally aware tasks.
Comment: 14 pages, BMVC 201
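The temporal-asymmetry property lends itself to an arrow-of-time task: classify whether a clip is played forward or backward. A minimal sketch of how such training samples could be generated (the function name and clip layout are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def make_arrow_of_time_sample(clip, rng):
    """Return (clip, label): label 1 if the clip keeps its forward
    frame order, 0 if its frames are reversed along the time axis.

    clip is assumed to be an array shaped (T, H, W, C)."""
    if rng.random() < 0.5:
        return clip, 1                  # forward playback
    return clip[::-1].copy(), 0         # reversed along time (axis 0)
```

An encoder that models temporal asymmetry well should separate the two classes; one that is insensitive to frame order cannot.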
DeepProposals: Hunting Objects and Actions by Cascading Deep Convolutional Layers
In this paper, a new method for generating object and action proposals in
images and videos is proposed. It builds on activations of different
convolutional layers of a pretrained CNN, combining the localization accuracy
of the early layers with the high informativeness (and hence recall) of the
later layers. To this end, we build an inverse cascade that, going backward
from the later to the earlier convolutional layers of the CNN, selects the most
promising locations and refines them in a coarse-to-fine manner. The method is
efficient, because i) it re-uses the same features extracted for detection, ii)
it aggregates features using integral images, and iii) it avoids a dense
evaluation of the proposals thanks to the use of the inverse coarse-to-fine
cascade. The method is also accurate. We show that our DeepProposals outperform
most of the previously proposed object proposal and action proposal approaches
and, when plugged into a CNN-based object detector, produce state-of-the-art
detection performance.
Comment: 15 pages
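Aggregating features with integral images (summed-area tables) is what makes scoring many candidate boxes cheap: after one cumulative-sum pass over a feature map, the sum inside any rectangular window costs four lookups. A minimal sketch of that mechanism (the function names are illustrative, not from the paper):

```python
import numpy as np

def integral_image(feat):
    """Summed-area table of a (H, W) feature map, padded with a zero
    row and column so window sums need no boundary checks."""
    ii = np.zeros((feat.shape[0] + 1, feat.shape[1] + 1), dtype=float)
    ii[1:, 1:] = feat.cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, r0, c0, r1, c1):
    """Sum of feat[r0:r1, c0:c1] in O(1) via four table lookups."""
    return ii[r1, c1] - ii[r0, c1] - ii[r1, c0] + ii[r0, c0]
```

This is why dense proposal scoring stays cheap: the O(H*W) table is built once per feature map, and every window after that is constant time regardless of its size.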
Online Action Detection
In online action detection, the goal is to detect the start of an action in a
video stream as soon as it happens. For instance, if a child is chasing a ball,
an autonomous car should recognize what is going on and respond immediately.
This is a very challenging problem for four reasons. First, only partial
actions are observed. Second, there is a large variability in negative data.
Third, the start of the action is unknown, so it is unclear over what time
window the information should be integrated. Finally, in real world data, large
within-class variability exists. This problem has been addressed before, but
only to some extent. Our contributions to online action detection are
threefold. First, we introduce a realistic dataset composed of 27 episodes from
6 popular TV series. The dataset spans over 16 hours of footage annotated with
30 action classes, totaling 6,231 action instances. Second, we analyze and
compare various baseline methods, showing this is a challenging problem for
which none of the methods provides a good solution. Third, we analyze the
change in performance when there is a variation in viewpoint, occlusion,
truncation, etc. We introduce an evaluation protocol for fair comparison. The
dataset, the baselines and the models will all be made publicly available to
encourage (much needed) further research on online action detection on
realistic data.
Comment: Project page: http://homes.esat.kuleuven.be/~rdegeest/OnlineActionDetection.htm
Estimating greenhouse gas emissions using emission factors from the Sugarcane Development Company, Ahvaz, Iran
Background: Greenhouse gas (GHG) emissions are increasing worldwide. They have harmful effects on
human health, animals, and plants and play a major role in global warming and acid rain.
Methods: This research investigated carbon dioxide (CO2) and methane (CH4) emissions obtained from different
parts of the Hakim Farabi, Dobal Khazaei, and Ramin factories which produce ethanol and yeast.
Seasonal rates of CO2 at the soil surface at the studied sites were estimated from measurements made
on location and at intervals with manual chambers. This study aimed to assess the production rate of
GHG emissions (CH4, CO2) in the sugar production units of Hakim Farabi, Dobal Khazaei, and Ramin
factories.
Results: Mean annual CO2 and CH4 emissions are, respectively, 279,500.207 and 3,087.07 tons/year
from the Hakim Farabi agro-industry, 106,985.24 and 1.14 tons/year at the Dobal Khazaei
ethanol-producing factory, and 124,766.17 and 1.93 tons/year at the Ramin leavening-producing factory.
Conclusion: Sugar plant boilers and the burning of sugarcane contributed the most CO2 and CH4
emissions, respectively. Moreover, lime kilns and diesel generators showed the least carbon dioxide and
methane emissions, respectively.
Keywords: Carbon Dioxide, Methane, Ethanol, Farms, Global Warming
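Bottom-up inventories of this kind typically multiply each source's activity data by a gas-specific emission factor and sum over sources. A minimal sketch of that bookkeeping (the source names and factor values below are illustrative only, not the study's data):

```python
def estimate_emissions(sources):
    """sources maps a source name to (activity, ef_co2, ef_ch4), where
    activity is e.g. tons of fuel burned per year and each emission
    factor is tons of gas emitted per unit of activity.
    Returns total (co2, ch4) in tons/year."""
    co2 = sum(activity * ef_co2 for activity, ef_co2, _ in sources.values())
    ch4 = sum(activity * ef_ch4 for activity, _, ef_ch4 in sources.values())
    return co2, ch4

# Illustrative numbers only:
example_sources = {
    "boiler": (1000.0, 2.5, 0.001),        # (activity, EF_CO2, EF_CH4)
    "cane_burning": (500.0, 1.2, 0.02),
}
```

Keeping per-source terms separate is what lets a study like this rank contributors (boilers vs. cane burning vs. lime kilns) rather than report only a total.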
Actor and Action Video Segmentation from a Sentence
This paper strives for pixel-level segmentation of actors and their actions
in video content. Different from existing works, which all learn to segment
from a fixed vocabulary of actor and action pairs, we infer the segmentation
from a natural language input sentence. This makes it possible to distinguish
fine-grained actors within the same super-category, to identify actor and
action instances, and to segment pairs outside the actor and action
vocabulary. We propose a fully convolutional model for pixel-level actor and
action segmentation using an encoder-decoder architecture optimized for video.
To show the potential of actor and action video segmentation from a sentence,
we extend two popular actor and action datasets with more than 7,500 natural
language descriptions. Experiments demonstrate the quality of the
sentence-guided segmentations, the generalization ability of our model, and its
advantage for traditional actor and action segmentation compared to the
state-of-the-art.
Comment: Accepted to CVPR 2018 as oral
Signal processing based damage detection of concrete bridge piers subjected to consequent excitations
Damage detection at an early stage is of great importance, especially for infrastructure, since the cost of repair is considerably less than that of reconstruction. Changes in stiffness and frequency can clearly indicate the occurrence of damage and its severity. The wavelet transform is a powerful mathematical tool for signal processing that provides more detail than the Fourier transform. In this paper, a model-free, output-only, wavelet-based damage detection analysis is performed to capture perturbations in the detail function of the acceleration response of bridge piers. First, a nonlinear time-history finite element analysis was performed using 9 consecutive earthquake records, from which the time-history acceleration response was derived. Pushover and hysteresis curves were also drawn from the results. Furthermore, after applying the wavelet transform to the structural response, irregularities appeared in the decomposed detail function, indicating the presence of damage in the models. Finally, the peak values of the detail coefficients point to the time instants at which damage occurs.
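The core of the analysis, decomposing the acceleration response and flagging spikes in the detail coefficients, can be sketched with a one-level Haar transform. The abstract does not specify the mother wavelet or threshold rule, so Haar, the median-based threshold, and all names below are illustrative assumptions:

```python
import numpy as np

def haar_detail(signal):
    """One-level Haar wavelet detail coefficients of a 1-D signal."""
    s = np.asarray(signal, dtype=float)
    if len(s) % 2:              # Haar pairs samples; drop a trailing odd one
        s = s[:-1]
    return (s[0::2] - s[1::2]) / np.sqrt(2.0)

def damage_instants(accel, dt, k=5.0):
    """Times where |detail| exceeds k times the median absolute detail,
    i.e. where the response has an irregularity suggesting damage.

    Assumed threshold rule: a robust median-based outlier test."""
    d = haar_detail(accel)
    threshold = k * np.median(np.abs(d)) + 1e-12
    hits = np.nonzero(np.abs(d) > threshold)[0]
    return hits * 2 * dt        # each coefficient covers two samples
```

On a smooth response the detail coefficients stay small; a sudden stiffness change injects a discontinuity that shows up as an isolated peak, whose index maps back to the damage instant.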